Watermarking

ABSTRACT

A watermarking system and method is proposed for still or moving pictures in which a watermark is embedded robustly and simply into DC-values without causing unacceptable visible picture degradation. The watermark is secure and can be readily detected, and the method can be used to convey additional data. Also disclosed is a method for watermarking a sequence of pictures in which the visibility of artifacts is reduced. The watermark may carry data and may be used to label a programme.

The invention relates to embedding a watermark into either a still picture or a sequence of moving pictures; this may be used to assist in detection of copying or identification of the originator of material or for transmitting hidden data. A watermark is a mark or data sequence that is embedded substantially invisibly into a picture for helping to identify the originator or the intended recipient of the picture or to detect tampering.

It has been considered to modulate a watermark onto sample mean values, also known as DC-values, but this has tended to result in unacceptable picture distortion.

Yeung and Mintzer [Journal of Electronic Imaging 7(3), 578-591 (July 1998)] proposed a watermarking system in which a watermark is applied to DC coefficients of a JPEG compressed image. This is stated to produce satisfactory results for an RGB image but there was some noticeable degradation (blockiness) observed when the watermark was applied to luminance values. Investigations by the inventor have suggested that such a technique is unsuited to high quality image reproduction or broadcast quality motion video marking as the artifacts tend to become too visible, and one is generally dealing with a luminance signal rather than an RGB signal.

Qiao and Nahrstedt [IEEE International Conference on Multimedia Computing and Systems, Austin Tex. USA, 28th June-1st July 1998] proposed a watermarking technique in which a watermark is applied to DCT coefficients in the transform domain. The technique is suitable for MPEG coded video. This disclosure notes that problems can arise when a watermark is applied to both DC and AC coefficients and suggests that no watermark should be applied to the DC coefficients. A drawback of this method is that the watermark must be applied in the transform domain and this may require transformation of an image, thereby increasing the complexity of the method.

For the above reasons, attempts have concentrated largely on spread spectrum techniques or embedding information into frequency bands other than DC values. The watermark is usually modulated onto a low frequency band. This makes the watermark more robust against attacks that are based on low pass signal processing such as data compression or digital filtering. This is in line with the properties of the human visual system. As a consequence, an attack on a low frequency band for corrupting the watermark information runs the risk of introducing visible distortions into the picture. However, for similar reasons, embedding the watermark may present problems; it may require complex and difficult to reproduce algorithms, may not embed the data reliably, or may introduce visible distortions.

A number of patents assigned to Digimarc Corporation having similar disclosures, of which US-A-5832119 is a representative example, disclose a watermarking method in which a periodically repeating picture is filtered to remove low and high frequency components to leave only mid-band components and added to a picture. In addition to the drawbacks identified above for techniques of this nature, a specific drawback with this method which the inventor has identified is that the watermark may become visible with certain picture types, particularly in flat areas of the picture. Due to the complexity of the method and hence processing required, the method is unsuited, in practical terms, to marking of moving video sequences, particularly of broadcast quality.

The inventor has appreciated that, whilst attempts to modulate a watermark signal onto DC values have generally been unsuccessful for the reasons mentioned above, the use of DC values might offer a robust and simple watermarking system as compared to conventional techniques, if these difficulties could be overcome.

According to a first aspect, the invention provides a method as set out in claim 1.

By making use of a plurality of adjustment factors for each watermark value, each of which adjustment factors is a function of a local estimate of visibility of the watermark within the picture and which is a function of the picture sample values (and substantially independent of the watermark values), it has been found that, surprisingly, the watermark information can be reliably embedded in the picture without causing unacceptable distortion to the picture. The embedded watermark value may then change the local mean or DC-values of the subset of pixels in which it is embedded, rendering detection simple and reliable.

It will be appreciated that this method may result in a watermarked picture in which watermark information is barely present or even not present at all in certain regions of the picture where the estimate of visibility suggests that the presence of the watermark is likely to cause visible distortions to the picture.

Preferably the magnitude of adjustment factors is determined from the picture sample values based on an estimate of visibility, preferably from the local variance. This enables the watermark values to be concealed effectively. There will be a plurality of values calculated for each subset, to take into account picture variation, and there is preferably an independently determined adjustment value for each picture sample, although calculation of neighbouring adjustment factors may involve some overlap to reduce calculation.

The sign of the adjustment factors is preferably a function of the watermark values, the watermark values preferably comprising a binary sequence of 0 and 1, being encoded as positive and negative signs respectively, or vice versa. By changing the sign, a robust coding scheme is provided, the magnitude of the change not being critical in detecting the watermark and hence being adjustable to allow the watermark to be kept substantially invisible. Alternatively, the magnitude may be adjusted in steps; this may increase the available data capacity but may increase visibility, reduce robustness or increase complexity.

In a development, since measures of visibility may be determined from the watermarked picture which should correspond substantially to the originally determined measures of visibility, it should be possible at a decoder to determine the available room for data and thus to employ dynamic allocation of watermark values or data values by encoding more data (perhaps 2 or even 3 bits) in a region where the visibility estimate suggests that larger adjustments can be tolerated. Similarly, decoding may disregard regions where a visibility estimate suggests that no data will be encoded. Weighted filtering may be employed at a decoder, based on an estimate of visibility.

As an alternative to a binary system using positive and negative signs, a three level system in which zero adjustment is employed could be used; this may not be so advantageous for simple watermarking as zero adjustments would not assist in correlation, but could be used for data samples carried within the watermark (as discussed below) to increase the data capacity.

The adjustment factors may be combined with picture values by adding (which term is intended to encompass weighted addition or subtraction); this is simple to implement yet effective, although more complex combination such as averaging may be employed.

Where it is desired for the watermark to be robust, for example to carry data or to make the watermark difficult to delete so that the source can be identified, as indicated above, the picture samples are preferably substantially adjacent. Small translations and distortions will tend to leave at least some of each subset of samples sufficiently unaffected for a determination of the watermark value for that subset. A grid may be defined, preferably substantially rectangular for ease of processing, although hexagonal or other grid shapes may be employed, and the samples in each region of the grid may be assigned to a subset. In this way, when it comes to decoding, a slight misalignment of the grid will normally not prevent the majority of samples within a decoding grid from carrying the correct value and permitting correct decoding.

When the picture is to be coded or compressed by an algorithm which partitions the picture into blocks, for example JPEG or MPEG coding, the grid preferably corresponds to blocks or groups of blocks of the coding algorithm. This may enable efficient processing and may also ensure that the watermark is reliably carried (more so than if individual watermark values were assigned to pixels in different blocks).

It has been found that if each subset comprises a block of at least about 4 by 4 samples (or 16 samples if non-rectangular grouping is used), this provides a much higher degree of robustness against a variety of attacks than a comparative example in which significantly fewer (or only one) picture samples per watermark value are employed. Preferably, blocks of at least about 8 by 8 samples (or similar size, at least about 64 samples if non-rectangular grids) are used, more preferably, at least for broadcast quality images, blocks of at least about 8 by 16 (preferably 8 vertically, 16 samples horizontally).

In a preferred development, a restriction condition is applied to the choosing of watermarks from the available watermarks. This may reduce the amount of data that can be carried but may increase robustness or error tolerance or detection. In a preferred embodiment, the watermarks are subdivided into subsets (for example 16 watermarks are sub-divided into 4 subsets of 4 marks) and a restriction condition related to the subsets is employed, for example exactly one (or in certain less preferred cases another predetermined number, for example in the case of larger subsets) watermark is chosen from each subset or each of a predetermined number of subsets (for example 1 mark from each of 3 of 4 subsets of 4 marks). By applying such a restriction, a measure of errors or the reliability of the data can be obtained based on the fact that there should be a given number (preferably one) of watermarks in each subset so if detection yields no marks or a reasonable probability of more than one mark in a subset, it can be assumed that the data is noisy or unreliable.

Watermark values may be assigned to substantially the whole of a picture. This may increase the dimension of the watermark and make unauthorised copying and detection more difficult. However, a repeating watermark may be used, or certain portions of the picture may be left blank.

The watermark may comprise a substantially static component and a variable component, the static component enabling the watermark to be positively identified, and the variable component carrying additional information, for example one or more of picture (or programme) title, date, author, originator, intended recipient, copying permissions, equipment or recording or coding conditions, user definable data and the like. Looked at another way, data may be carried with the watermark. In the case of a moving picture sequence, a separate hidden data stream may be carried, some of the watermark assisting in alignment and framing and the remainder carrying user data. If the application is such that the source and framing of the sequence can be guaranteed, then no static watermark may be needed for synchronisation, and the whole of the watermark may in fact comprise variable user information.

In certain cases, it may be desirable to make the watermark “fragile”, so that processing of the data can be detected readily by measuring degradation of the watermark; this may be used for authentication of original copies. This may be achieved by scattering the picture sample values of each subset over the picture, and by increasing the number of watermark values and decreasing the number of pixels per value. At the extreme, the method may be modified to use only a single picture sample per watermark value, but this will normally require use of the original picture to detect the watermark reliably.

The watermark preferably comprises a pseudo random pattern; this makes it harder for an unauthorised person to detect or apply the watermark. However, in certain applications, a logo or regular pattern may be employed; this may simplify identification, for example visually from a difference picture.

When the watermark is embedded into a moving sequence, it has been found that, surprisingly, although the watermark may be almost impossible to detect when the sequence is viewed frame by frame or in slow motion, artifacts may become visible when the sequence is viewed at normal speed. Investigation has found that this is due to movement in the picture causing the effectively static grid corresponding to assignment of watermark values to appear in a similar manner to a dirty window overlaid on the picture. Further investigations have revealed that other known watermarking techniques, when examined closely, are prone to similar problems. The prior art does not address these unforeseen problems which are peculiar to moving sequences.

Further aspects of the invention provide methods as set out in claims 20, 25 and 26. These may alleviate the problems of watermarking a moving picture and may be employed in conjunction with the first aspect, or independently. Preferred features of these aspects are set out in claims 21 to 30.

The watermarked picture may be communicated or stored together with data facilitating identification of the watermark, as set out in claims 31 to 33.

The invention further provides a method of testing for the presence of a watermark as set out in claim 34 and the preferred features set out in claims 35 to 44.

The invention further provides methods of creating, embedding and detecting data-carrying watermarks, methods of marking moving pictures, applications of marked pictures, computer program products and apparatus for implementing any of the methods described above or below; further aspects and preferred features are set out in the other independent and dependent claims respectively, and may also be found in the following description of a preferred embodiment.

An embodiment of the invention will now be described by way of example, with reference to the accompanying drawings in which:-

FIG. 1 shows a general outline of a watermarking system;

FIG. 2 illustrates partition of a picture into square cells;

FIG. 3 schematically illustrates an example of calculation of variance for the current sample s(n) and its neighbouring samples that belong to B_n;

FIG. 4 illustrates partition of a picture into square cells that carry either a watermark sample or a data bit;

FIG. 5 illustrates sub-division of a set of 16 watermarks into 4 subsets of 4;

FIG. 6 shows an 8×8 sample block for carrying data;

FIG. 7 shows the blocks of FIG. 6 tiled m×n;

FIG. 8 shows the tiled blocks of FIG. 7 forming a data watermark spread across a picture.

Referring to FIG. 1, the watermark w is embedded into the original signal s, resulting in the watermarked signal s_w. The watermarked signal may be changed by friendly attacks that are caused by transmission techniques, e.g. data compression, or by hostile attacks that deliberately attempt to remove the watermark. Therefore, the signal š_w instead of s_w feeds the input of the watermark-detector. The detector outputs a binary decision which indicates if a given watermark is present in the input signal or not.

In this description restricted watermarking means that the original signal is also needed as an input to the watermark-detector. In this case watermark detection is restricted to users who are in possession of the original signal. Unrestricted watermarking means that the original signal is not needed during detection. Both cases will be addressed.

The watermark is a zero-mean (to prevent a change in global mean (average brightness) of the signal after watermarking) white noise random signal that is chosen independently from the original signal and difficult-to-predict for an attacker. The role of the watermark is similar to the role of the secret key in a symmetric crypto-system. A pseudo random bit generator can be used for obtaining the antipodal series w(1), . . . w(k), . . . w(K) with

$\begin{matrix}{w^{2}(k) = 1, \quad k = 1, \ldots, K.} & (1)\end{matrix}$

In restricted watermarking the seed-number of the bit generator can depend on the value of a hash function that is applied to the samples of the original signal for providing authentication.
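As a rough illustration of the seeding idea, the following Python sketch derives an antipodal series from a hash of the picture samples. The function name `antipodal_watermark`, the use of SHA-256, the optional `secret_key`, and the assumption of 8-bit samples are all illustrative choices that do not come from the description; Python's `random.Random` is not a cryptographically secure generator and stands in here only for whichever pseudo random bit generator is actually used.

```python
import hashlib
import random


def antipodal_watermark(picture_samples, num_cells, secret_key=b""):
    """Return num_cells values in {-1, +1}, seeded from a hash of the picture.

    picture_samples: iterable of 8-bit sample values (0..255); this is an
    assumption made so that the samples can be hashed as raw bytes.
    """
    digest = hashlib.sha256(secret_key + bytes(picture_samples)).digest()
    rng = random.Random(int.from_bytes(digest, "big"))
    # Shuffle a balanced list so that the mean is exactly zero when
    # num_cells is even (an illustrative way of meeting the zero-mean aim).
    half = num_cells // 2
    w = [1] * half + [-1] * (num_cells - half)
    rng.shuffle(w)
    return w
```

Because the seed depends only on the samples and the key, a detector holding the original signal can regenerate the same series, and every value satisfies w²(k) = 1 as in eq. (1).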

The original picture s is partitioned into cells. This is shown in FIG. 2 for a square cell shape. There is no restriction on the cell shape and any other tiling of the x/y-plane can be used just as well, for example a hexagonal grid. The cell shape and the cell size are parameters that are kept secret together with the watermark w. Let C_k denote the set of indices that select the samples of the k-th cell. The watermark is modulated onto the sample values as

$\begin{matrix}{s_{w}(n) = s(n) + w(k) \cdot \alpha^{2}(n), \quad n \in C_{k}.} & (2)\end{matrix}$

The magnitude α²(n) depends on a visibility measure and determines by how much the amplitude of a sample can be changed without creating a visible distortion. For example the local variance can be calculated in a small window of 7×7 samples that is centred at the current sample position, see FIG. 3. Let B_n denote the set of indices that select the samples in the neighbourhood including the current index n.

The variance is calculated as

$\begin{matrix}{\mathrm{var}(n) = \left( \frac{1}{|B_{n}|} \cdot \sum\limits_{l \in B_{n}} s^{2}(l) \right) - \left( \frac{1}{|B_{n}|} \cdot \sum\limits_{l \in B_{n}} s(l) \right)^{2}} & (3)\end{matrix}$

where |B_n| is the number of samples in B_n. Averaging over all N samples gives the mean value

$\begin{matrix}{\overline{\mathrm{var}} = \frac{1}{N} \cdot \sum\limits_{n = 1}^{N} \mathrm{var}(n)} & (4)\end{matrix}$

Additionally, the positive modulation index q is introduced for allowing a global control of the energy of the modulated watermark, resulting in the magnitude

$\begin{matrix}{\alpha^{2}(n) = \begin{cases}\dfrac{2 \cdot \mathrm{var}(n) + \overline{\mathrm{var}}}{\mathrm{var}(n) + 2 \cdot \overline{\mathrm{var}}} & \text{if } \mathrm{var}(n) > th_{flat} \\ 0 & \text{if } \mathrm{var}(n) \leq th_{flat}\end{cases}} & (5)\end{matrix}$

As a consequence of eq. (5) no watermark information is embedded into flat areas that are detected with a threshold th_flat.

Eq. (5) is one example for the calculation of the magnitude α²(n) and more sophisticated models of the human visual system can be applied in combination with the embedding method that is specified in eq. (2).
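The embedding steps of eqs. (2) to (5) can be sketched in plain Python as follows. This is a minimal illustration under stated assumptions rather than a definitive implementation: the picture is a list of rows of luminance values, cells are square `cell` × `cell` blocks, the 7×7 window is clipped at the picture edges, and the modulation index `q` is applied as a simple multiplier of the ratio in eq. (5), which is an assumed placement since the printed equation does not show where q enters. The names `local_variance` and `embed` are hypothetical.

```python
def local_variance(s, x, y, half=3):
    """Eq. (3): variance over the window B_n centred at (x, y).

    half=3 gives the 7x7 window of the example; the window is clipped at
    the picture edges (an assumption, edge handling is not specified).
    """
    h, w = len(s), len(s[0])
    vals = [s[j][i]
            for j in range(max(0, y - half), min(h, y + half + 1))
            for i in range(max(0, x - half), min(w, x + half + 1))]
    mean = sum(vals) / len(vals)
    return sum(v * v for v in vals) / len(vals) - mean * mean


def embed(s, wm, cell=8, th_flat=1.0, q=1.0):
    """Eq. (2): s_w(n) = s(n) + w(k) * alpha^2(n) for n in cell C_k.

    wm[r][c] in {-1, +1} is the watermark value of the cell in row r,
    column c of the cell grid.
    """
    h, w = len(s), len(s[0])
    var = [[local_variance(s, x, y) for x in range(w)] for y in range(h)]
    var_mean = sum(map(sum, var)) / (h * w)            # eq. (4)
    out = []
    for y in range(h):
        row = []
        for x in range(w):
            v = var[y][x]
            if v > th_flat:                            # eq. (5), textured area
                alpha2 = q * (2 * v + var_mean) / (v + 2 * var_mean)
            else:                                      # eq. (5), flat area
                alpha2 = 0.0
            row.append(s[y][x] + wm[y // cell][x // cell] * alpha2)
        out.append(row)
    return out
```

For positive variances the ratio in eq. (5) lies between 0.5 and 2, so with q = 1 each sample in a textured area moves by between half a level and two levels, while flat areas are left untouched.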

After transmission and possible attacks the received signal š_w is partitioned into cells of appropriate shape and size corresponding to the embedding procedure. For each cell a sample mean value is calculated, hereafter called DC-value. The DC-value of the k-th cell is calculated as

$\begin{matrix}{DC(k) = \frac{1}{|C_{k}|} \cdot \sum\limits_{n \in C_{k}} \check{s}_{w}(n),} & (6)\end{matrix}$

In restricted watermarking a corresponding DC-value is calculated from the original signal s,

$\begin{matrix}{DC_{ori}(k) = \frac{1}{|C_{k}|} \cdot \sum\limits_{n \in C_{k}} s(n),} & (7)\end{matrix}$

and in unrestricted watermarking a prediction value is calculated from the DC-values specified in eq. (6),

$\begin{matrix}{DC_{pred}(k) = \frac{1}{|C_{k}|} \cdot \sum\limits_{l \neq k} \beta_{l} \cdot DC(l)} & (8)\end{matrix}$

The coefficients β_l are the same for each picture and can be calculated by linear regression for minimising the mean squared error between DC(k) and DC_pred(k). However, in terms of computational complexity a simpler prediction method is to average the DC-values of the neighbouring cells.

In restricted watermarking ΔDC(k) = DC(k) − DC_ori(k) and in unrestricted watermarking ΔDC(k) = DC(k) − DC_pred(k) is correlated with the watermark,

$\begin{matrix}{corr = \frac{\left( \frac{1}{K} \cdot \sum\limits_{k = 1}^{K} \Delta DC(k) \cdot w(k) \right) - \left( \frac{1}{K} \cdot \sum\limits_{k = 1}^{K} \Delta DC(k) \right) \cdot \left( \frac{1}{K} \cdot \sum\limits_{k = 1}^{K} w(k) \right)}{1 - \left( \frac{1}{K} \cdot \sum\limits_{k = 1}^{K} w(k) \right)}} & (9)\end{matrix}$

The detector decides upon the presence of the watermark in the signal š_w if the magnitude of the correlation value exceeds a threshold, |corr| ≥ th_detect. The sign sgn(corr) of the correlation value signals one hidden data bit if the presence of the watermark is detected. As no watermark information is embedded into flat areas the detector has the option to exclude flat areas during the evaluation of eq. (9).
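The detection chain of eqs. (6) to (9) might be sketched as below. The sketch assumes square cells, takes the simpler neighbour-averaging predictor mentioned above (with a 4-connected neighbourhood, which is an assumption) in place of the regression coefficients of eq. (8), and evaluates eq. (9) as printed; for a zero-mean watermark the denominator equals 1. All function names are hypothetical.

```python
def dc_values(picture, cell):
    """Eqs. (6)/(7): sample mean of each cell x cell block."""
    rows, cols = len(picture) // cell, len(picture[0]) // cell
    return [[sum(picture[r * cell + j][c * cell + i]
                 for j in range(cell) for i in range(cell)) / (cell * cell)
             for c in range(cols)]
            for r in range(rows)]


def neighbour_prediction(dc):
    """Simplified stand-in for eq. (8): mean of the 4-connected neighbours.

    Needs a grid of at least two cells so that every cell has a neighbour.
    """
    rows, cols = len(dc), len(dc[0])
    pred = []
    for r in range(rows):
        row = []
        for c in range(cols):
            nb = [dc[rr][cc]
                  for rr, cc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1))
                  if 0 <= rr < rows and 0 <= cc < cols]
            row.append(sum(nb) / len(nb))
        pred.append(row)
    return pred


def correlation(delta_dc, wm):
    """Eq. (9) as printed; wm must not consist solely of +1 values."""
    d = [v for row in delta_dc for v in row]
    w = [v for row in wm for v in row]
    k = len(d)
    mean_d, mean_w = sum(d) / k, sum(w) / k
    cross = sum(a * b for a, b in zip(d, w)) / k
    return (cross - mean_d * mean_w) / (1 - mean_w)


def detect_restricted(received, original, wm, cell, th_detect):
    """Restricted case: delta DC(k) = DC(k) - DC_ori(k), then threshold."""
    dc, dc_ori = dc_values(received, cell), dc_values(original, cell)
    delta = [[a - b for a, b in zip(ra, rb)] for ra, rb in zip(dc, dc_ori)]
    corr = correlation(delta, wm)
    return abs(corr) >= th_detect, corr
```

For the unrestricted case the same `correlation` function would be applied to DC(k) minus `neighbour_prediction(dc)` instead of DC(k) minus DC_ori(k).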

In restricted watermarking it is easier to align the sampling grid of the received signal š_w relative to the sampling grid that was used for embedding the watermark. This can be done by comparing š_w with the original signal s, which also allows compensation for geometric distortions such as scaling or rotation. In unrestricted watermarking the sampling grids can be aligned by a search for maximum correlation among a set of horizontal and vertical offset values that are applied to the sampling grid of š_w. As every watermark sample is spread over one cell, perfect alignment of the sampling grids is not needed for obtaining a good correlation. This property significantly improves the robustness against attacks that re-sample the picture, including geometric attacks that introduce an unnoticeable amount of distortion. Additionally, the cell size and the cell shape can be adapted for improving the robustness against specific types of geometric distortions. Rotation, cropping or scaling by a noticeable amount can be handled by hypothesis testing, which however results in a computationally intensive search for maximum correlation.

A common method for increasing the capacity of the hidden data channel is to partition the picture into sub-pictures and to apply the above watermarking method to each sub-picture. Another method for increasing the data capacity shall now be exemplified for a square cell shape, see FIG. 4. The white cells are used for the watermarking method as described above. Firstly, the watermark is detected from the white cells at the receiver. This also allows synchronisation and alignment of the sampling grid. Secondly, one data bit is detected from each dark cell as follows. For the k-th cell the DC-values of eqs. (6)-(8) are calculated. In restricted watermarking, the sign of the difference

$\Delta DC(k) = DC(k) - DC_{ori}(k)$

signals the data bit and in unrestricted watermarking the sign of the difference

$\Delta DC(k) = DC(k) - DC_{pred}(k)$

signals the data bit. At the transmitter the data bit is embedded similarly to eqs. (2)-(5), with the watermark sample in eq. (2) replaced by the antipodal data bit. As no information is embedded into flat areas and prediction can fail in local areas of the picture, the detector has the option to evaluate the data bit only if

$th_{min} < |\Delta DC(k)| = |DC(k) - DC_{pred}(k)| < th_{max};$

this can be taken into account during embedding.

The above method significantly increases the gross data rate. However, robustness is lost in comparison with the watermark that is carried on the white cells. Therefore, error correcting codes are applied to the data that is carried on the dark cells.

We have described above methods of increasing data rate. In addition to the prediction technique already outlined we will describe another way of increasing the payload. Instead of using one pseudo-random bit pattern as the watermark w, a plurality of N substantially statistically independent patterns w₁, . . . , w_N is used. The patterns are preferably statistically independent with the following properties:

(1) $w_{1}^{2} = \ldots = w_{N}^{2} = 1$

(2) $E[w_{1}] = \ldots = E[w_{N}] = 0$, where E is the expectation operator

(3) $E[w_{k} \cdot w_{n}] = 0$ if $k \neq n$

Each payload is represented by a combination of three watermarks. Although there could be fewer (for example 2) or more watermarks combined to increase the data capacity, we have found that, surprisingly, by combining exactly three watermarks from a number (ideally a defined set) of substantially independent watermarks, an optimum result may be achieved in terms of reliability of detection and increase in data payload. Thus, there is a total of N·(N−1)·(N−2)/6 possible combinations (neglecting trivial or redundant combinations in which two or three watermarks are the same or the order of combination is altered, which cannot be detected using simple combination of watermarks with the picture). For example, if N=16 there are 560 combinations, and one can carry a payload of 9 information bits, plus reserved combinations for other signalling. Surprisingly, we have found that the payload may be most advantageously increased by selecting a reasonably large number of independent watermarks in this set or “library” and then combining exactly three of these in any one picture (rather than combining larger numbers of watermarks), the maximum size of the library being dependent on the ease with which watermarks can be distinguished, so varying with picture size. In order to ease the notation let us further assume that the actual payload is represented by the three watermarks w₁, w₂, w₃. One could then generate a watermark w by calculating the average value of w₁, w₂ and w₃.

(4) $w = [w_{1} + w_{2} + w_{3}]/3$

However, this combination has the property that the expectation value of the product of any of the individual watermarks with the combined watermark is ⅓.

(5) $E[w \cdot w_{1}] = E[w \cdot w_{2}] = E[w \cdot w_{3}] = 1/3$

A more advantageous way of combining w₁, w₂ and w₃ is

(6) $w = [w_{1} + w_{2} + w_{3} - w_{1} \cdot w_{2} \cdot w_{3}]/2$

The watermark w then has the following properties:

(7) $w^{2} = 1$

(8) $E[w] = 0$

(9) $E[w \cdot w_{1}] = E[w \cdot w_{2}] = E[w \cdot w_{3}] = 1/2$
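The properties of the two combinations can be checked by enumerating every sign pattern of three bipolar values, which behaves like three ideal independent zero-mean patterns. The Python sketch below does this; the function names are hypothetical, and the exhaustive enumeration is an illustrative substitute for real pseudo-random patterns.

```python
from itertools import product


def combine_average(w1, w2, w3):
    """Averaging combination: w = (w1 + w2 + w3) / 3."""
    return [(a + b + c) / 3 for a, b, c in zip(w1, w2, w3)]


def combine_bipolar(w1, w2, w3):
    """Bipolar combination: w = (w1 + w2 + w3 - w1*w2*w3) / 2.

    The numerator is always +2 or -2 for bipolar inputs, so the integer
    division is exact and the result takes the sign of the majority.
    """
    return [(a + b + c - a * b * c) // 2 for a, b, c in zip(w1, w2, w3)]


def mean_product(u, v):
    """Sample estimate of E[u * v]."""
    return sum(a * b for a, b in zip(u, v)) / len(u)


# All 8 sign patterns: three ideal, mutually uncorrelated bipolar sequences.
triples = list(product((-1, 1), repeat=3))
w1, w2, w3 = ([t[i] for t in triples] for i in range(3))
w_avg = combine_average(w1, w2, w3)
w_bip = combine_bipolar(w1, w2, w3)
```

On this enumeration `mean_product(w_avg, w1)` evaluates to one third and `mean_product(w_bip, w1)` to one half, while every element of `w_bip` is ±1, so the combined mark is itself bipolar and can be embedded exactly like a single watermark.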

The watermark w is embedded in the DC-values of the picture in the usual way. For detection of the payload one generates the watermark w′

(10) $w' = w_{1} + \ldots + w_{N}$

and cross-correlates the watermark w′ with the DC-values of the pictures in the usual way. If a correlation-peak is detected, the payload is retrieved by cross-correlating each watermark w₁, . . . , w_N separately with the DC-values and selecting the three watermarks with the largest correlation peaks. Although, as mentioned above, it is greatly preferred if the patterns are independent and satisfy the above rules, it may be desirable in some cases to use patterns which are not truly independent but have a low level of cross correlation (this may increase the number of patterns that can be used or simplify pattern selection); this may make it harder to detect each component reliably, but this may be useful in certain applications, for example where it is intended that the data embedded should be well concealed or “fragile” (i.e. easily corrupted). The above provides a method of combining three bipolar watermarks to produce a single bipolar watermark with the property that the product of the combined watermark with each of the constituent marks has an expectation value of ½.

To increase the reliability of data embedding, the available watermarks may be sub-divided into subsets. To recap, the method described above gave a possible 560 combinations, based on the binomial co-efficient

$\begin{pmatrix}16 \\ 3\end{pmatrix} = \frac{16!}{3! \cdot (16 - 3)!} = 560$

which, as mentioned, is just over 2⁹.

In a modified proposal, the set is partitioned into 4 subsets of 4 marks and exactly one mark is chosen from each of 3 subsets, as schematically illustrated in FIG. 5. This gives as a number of possible combinations

$\begin{pmatrix}4 \\ 3\end{pmatrix} \cdot \begin{pmatrix}4 \\ 1\end{pmatrix}^{3} = 256$

which is exactly 2⁸ so exactly 8 bits can be carried, with increased robustness, the first binomial coefficient giving the number of ways of choosing 3 subsets from 4 and the second giving the number of ways of choosing 1 from 4 watermarks in a subset.
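The two counts can be reproduced with a few lines of Python using the standard `math.comb` function. The helper names and default arguments are illustrative only; the defaults mirror the 16-mark library and the 4 subsets of 4 marks used in the text.

```python
from math import comb


def unrestricted_combinations(library_size=16, marks_per_picture=3):
    """Number of ways of choosing 3 distinct marks from the whole library."""
    return comb(library_size, marks_per_picture)


def restricted_combinations(num_subsets=4, subset_size=4, subsets_used=3):
    """Choose which subsets are used, then one mark from each used subset."""
    return comb(num_subsets, subsets_used) * comb(subset_size, 1) ** subsets_used


def whole_bits(combinations):
    """Largest b such that 2**b <= combinations."""
    return combinations.bit_length() - 1
```

With the defaults, `unrestricted_combinations()` gives 560 (9 whole bits, with 48 combinations left over for other signalling) and `restricted_combinations()` gives 256 (exactly 8 bits).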

Although the number of bits of information that is conveyed has been reduced, a detection strategy can be used that can give increased confidence in the reliability of any data detected.

When correlation detection is used, the presence of a peak, over a certain threshold, in the correlation surface of the watermark and the picture (after pre-processing), indicates that the mark has been detected.

The following procedure may be used to extract the data conveyed by the watermarks.

1) Check that exactly three subsets have been used.

Cross-correlate the picture with each of the 16 watermarks in turn. If there are more or fewer than three cross-correlation functions with peaks above the detection threshold then no data can be recovered.

2) Inspect the distribution of the three peaks.

If there are three peaks, they should each belong to a different subset of 4 of the 16 possible watermarks. If more than one mark in a subset has a cross-correlation peak above the threshold then no data can be recovered.

3) Each peak conveys 2 bits of data, and the choice of which three of the four subsets are used conveys a further 2 bits of data, giving 8 bits in total.

A particular advantage is that a soft decision threshold can be used when detection falls below the threshold. For example, the three highest peaks that are derived from different subsets could be used. That is, the threshold can be varied until exactly 3 peaks are detected, with one in each of three subsets, the fact that each mark is in a different subset serving as a check (if the three marks giving the highest peaks are not in different subsets, an error can be assumed).
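The extraction procedure and the soft decision variant can be sketched as follows. The sketch assumes that the 16 peak heights are supplied in library order, that watermark i belongs to subset i // 4, and that the 8 payload bits are packed as the index of the unused subset followed by the position of each detected mark within its subset; this bit layout and the function names are hypothetical, since the description does not fix a mapping.

```python
def _payload_from_hits(hits):
    """Map three watermark indices to an 8-bit value, or None on a check failure."""
    subsets = [i // 4 for i in hits]
    if len(hits) != 3 or len(set(subsets)) != 3:
        return None                      # not exactly one mark in each of 3 subsets
    unused = ({0, 1, 2, 3} - set(subsets)).pop()
    value = unused                       # 2 bits: which subset is left out
    for i in sorted(hits):               # subsets in ascending order
        value = (value << 2) | (i % 4)   # 2 bits per detected mark
    return value


def decode_payload(peaks, threshold):
    """Hard decision: exactly three peaks above the threshold are required."""
    hits = [i for i, p in enumerate(peaks) if p > threshold]
    return _payload_from_hits(hits)


def soft_decode(peaks):
    """Soft decision: take the three highest peaks and apply the subset check."""
    top3 = sorted(range(len(peaks)), key=lambda i: peaks[i], reverse=True)[:3]
    return _payload_from_hits(top3)
```

Returning `None` corresponds to the "no data can be recovered" outcomes of steps 1 and 2, and in the soft decision case to the assumed error when the three highest peaks are not in different subsets.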

Higher Data Rate Watermarking

A variety of methods with differing data capacities have been discussed. A further embodiment which may allow the amount of data carried to be increased will now be described, with reference to FIGS. 6-8.

To convey a data payload within a video watermark, such as the system described previously, the following method may be employed to scatter the payload data across the picture.

The payload, for example 64 bits having values +1 or −1, may be configured as a block of 8×8 bits (as shown in FIG. 6), which may then be tiled m×n times to form a data array as in FIG. 7. Bi is one bit in this 8×8 block.

In the watermark encoder each occurrence of payload data Bi is convolved with a pseudo-random sequence of length m×n corresponding to a predetermined key. This is done for each of the 8×8 payload data bits. Each resulting value in this (8×m)×(8×n) array is then used as a watermark. Of course, where different payload sizes and shapes and different tiling patterns are used, the bits of the payload will be convolved with the bits of the key in an appropriate fashion. Convolution of the data is most preferably performed by a multiplication (considering the data to be signed + or −) or equivalently an XOR [or XNOR] operation (considering the input data to be unsigned). The process may be explained as follows:-

Bipolar multiplication          Unipolar XOR
Data   Key   Watermark          Data   Key   Watermark
 +1    +1      +1                1      1       0
 +1    −1      −1                1      0       1
 −1    +1      −1                0      1       1
 −1    −1      +1                0      0       0

The watermark is ideally zero mean, as indicated above, so if the watermark is generated from a logical XOR, the binary values 0 and 1 would in fact be applied to the picture as bipolar values, with 1 corresponding to +1 and 0 corresponding to −1 (or vice versa). It will be noted that the absolute amounts to be added to or subtracted from each pixel value may in fact vary from pixel to pixel, based on a local estimate of visibility, as described above; however, whilst this is highly preferred, the method of generating a watermark which carries data can be used in conjunction with another method of embedding the watermark.
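The equivalence between bipolar multiplication and the unipolar XOR can be checked mechanically. The short Python sketch below regenerates the four rows of the table; the helper names are hypothetical, and the output mapping (XOR result 0 taken as +1, result 1 taken as −1) is the "vice versa" choice that makes the two watermark columns of the table agree.

```python
from itertools import product


def to_bipolar(bit):
    """Input mapping used in the table: 1 -> +1, 0 -> -1."""
    return 1 if bit else -1


def xor_to_bipolar(mark_bit):
    """Output mapping matching the table: XOR result 0 -> +1, 1 -> -1."""
    return 1 - 2 * mark_bit


def table_rows():
    """Rows of (data, key, product, data_bit, key_bit, xor) in table order."""
    rows = []
    for d, k in product((1, 0), repeat=2):
        rows.append((to_bipolar(d), to_bipolar(k), to_bipolar(d) * to_bipolar(k),
                     d, k, d ^ k))
    return rows
```

With the input mapping held fixed, the product corresponds to the XNOR of the two bits, which is why either XOR or XNOR may be used provided the output polarity is chosen to match.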

Each bit of the resulting watermark is applied to a block of pixels in a picture (ideally 4×4, which is found to give optimum results in terms of enabling each data bit to be reliably detected and allowing a large key and data payload size, but other block sizes and shapes may be used) such that the array is spread over the whole or part of the extent of the picture, as shown in FIG. 8. In a most preferred example, for a television picture of 576 lines by 720 pixels, this gives 18×22.5 blocks of 8×8 data bits each spread over a block of 4×4 pixels; this allows 405 bits for the key (the geometric arrangement is not critical, so the half blocks can be split over 2 lines). This arrangement is advantageous as the relatively large key allows reliable detection, but still allows a useful data payload. Of course, other payload sizes and configurations may be used, for example 16×8 blocks may carry 128 bits with a 202 bit key and 16×16 blocks may carry 256 bits, allowing 101 bits for the key; such a key may still be reliably detected in many cases. For example if the watermark is applied to a signal as it is transmitted, such a key should be detectable at a receiver, for example to enable programme identification information to be decoded reliably. The blocks need not be square, nor even rectangular (though these are most convenient for efficient packing of data, and less vulnerable to corruption of data by resizing operations) or even regular; any shape which can be applied many times over the picture, whether interlocking or disjointed but preferably non-overlapping, may be used. To summarise, the method of embedding the watermark may be considered in three ways as follows:-

1) Block by Block

The data block is replicated n×m times, so in our example we can consider a set of 405 8×8 data blocks all containing identical data. 405 individual pseudo-random sequences of length 8×8=64 (having values +1 and −1) are then convolved with the data in the respective blocks.

The blocks are then assembled to make a single rectangular array in some convenient manner, and each point in that array is expanded to cover, in our example, say 4×4 individual pixels in the picture (for a picture of normal European TV resolution, 720×576). This watermark may then be combined with other watermarks, as discussed below.

2) Whole Picture

The picture area is considered as groups of, say, a×b pixels. In our example, this is 4×4. The data block is replicated n×m times across the entire picture, such that each element of an 8×m by 8×n (180×144) array corresponds to a 4×4 cluster of picture pixels. Special arrangements must be made for edge effects if n or m are not integers. In our example a single pseudo-random sequence of length 8×8×22.5×18=25920 (having values +1 and −1) is then convolved with the 180×144 array.

Each point in the resulting array is expanded to cover, in our example, 4×4 individual pixels in the picture (for a picture of normal European TV resolution, 720×576). This watermark may then be combined with other watermarks, as discussed below.

3) Data Bit by Data Bit

Each data bit in the 8×8 data block is replicated n×m times, to create a set of 64 m×n arrays. To take account of the non-integer value of m or n, the resulting array may have a varying number of data points on each line, or may be considered as a set of 64 405-element linear arrays. 64 individual pseudo-random sequences of length 22.5×18=405 (having values +1 and −1) are then convolved with the respective arrays.

The 64 arrays are then interleaved (every 8 positions vertically and horizontally, or in some other way) to make a single rectangular array of size 180×144 (in our example), and each point in that array is expanded to cover, in our example, 4×4 individual pixels in the picture (for a picture of normal European TV resolution, 720×576). This watermark may then be combined with other watermarks, as discussed below.

The resulting watermark may be applied as a single watermark to the picture, and detected as for other watermarks by cross correlation with the key.

It will be appreciated that the three examples mentioned above may be applied to different watermark and data payload sizes.
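By way of illustration only, the "whole picture" view of the embedding may be sketched as follows. This is a simplified sketch under stated assumptions, not a definitive implementation: the key is drawn from Python's general-purpose `random` module purely as a stand-in for the secret pseudo-random sequence, "convolving" is taken to be multiplication of bipolar values, the half blocks at the right-hand edge are handled by plain truncation of the tiling, and the visibility-dependent scaling of each pixel adjustment is omitted. All function names are hypothetical.

```python
import random


def make_key(length, seed):
    """Pseudo-random bipolar key (+1/-1); the seed stands in for the secret."""
    rng = random.Random(seed)
    return [rng.choice((-1, 1)) for _ in range(length)]


def build_watermark_array(data_bits, key, rows=144, cols=180, tile=8):
    """Tile the 8x8 data block over the array and multiply by the key."""
    array = []
    for r in range(rows):
        line = []
        for c in range(cols):
            bit = data_bits[(r % tile) * tile + (c % tile)]
            # Data bit 1 maps to +1 and 0 maps to -1 before combining with the key
            line.append((1 if bit else -1) * key[r * cols + c])
        array.append(line)
    return array


def expand(array, block=4):
    """Repeat each array point over a block x block cluster of pixels."""
    picture = []
    for line in array:
        wide = [value for value in line for _ in range(block)]
        picture.extend(list(wide) for _ in range(block))
    return picture


data = [(i * 5) % 3 == 0 for i in range(64)]  # arbitrary 64-bit payload
key = make_key(144 * 180, seed=1)
watermark = expand(build_watermark_array(data, key))
print(len(watermark), "lines of", len(watermark[0]), "pixels")
```

The result is a 576-line by 720-pixel array of +1 and −1 values which would then be scaled and added to the picture.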

Most preferably, using the principles discussed above, and as discussed further below, a further registration watermark is included, at a known position with respect to the data-carrying watermark. It is desirable, as discussed above, that the registration watermark is substantially orthogonal to the data-carrying watermark. It will be appreciated that the data-carrying watermark varies with the data, and it is not practicable to test every possible data set for orthogonality of the resulting data-carrying watermark with the registration watermark. It is, however, possible to test each (405 bit or whatever size) key segment for degree of correlation with the corresponding (i.e. co-located) segment of the registration watermark to ensure a low correlation for each segment. Since the data merely affects the sign of each segment, if the correlation is low, it will remain low whatever the data. Thus, in the ideal case where the correlation between each key segment and each corresponding watermark segment is exactly zero, this will remain true whatever the data, and the sum of the correlations, being the correlation between the complete data-carrying watermark and registration watermark, will also be zero. It is important to note that simply correlating a complete data-carrying watermark (for example with dummy data all 1s) with the complete registration watermark will not give a reliable test, as some segments may cancel by chance but would not cancel if the data were different.
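The reason for testing segment by segment can be seen from a deliberately tiny example. The sketch below assumes two key segments of two values each and treats each data bit as a sign applied to its whole segment; the function names and values are illustrative only.

```python
def correlate(a, b):
    """Plain correlation (sum of products) of two equal-length sequences."""
    return sum(x * y for x, y in zip(a, b))


def segment_correlations(key_segments, registration_segments):
    """Correlation of each key segment with the co-located registration segment."""
    return [correlate(k, r) for k, r in zip(key_segments, registration_segments)]


def total_correlation(key_segments, registration_segments, data_signs):
    """Correlation of the complete data-carrying watermark with the registration watermark."""
    return sum(
        sign * correlate(k, r)
        for sign, k, r in zip(data_signs, key_segments, registration_segments)
    )


key_segments = [[1, 1], [1, 1]]
registration_segments = [[1, 1], [-1, -1]]

print(segment_correlations(key_segments, registration_segments))        # per-segment test
print(total_correlation(key_segments, registration_segments, [1, 1]))   # dummy data, all 1s
print(total_correlation(key_segments, registration_segments, [1, -1]))  # different data
```

With dummy data of all 1s the two segment correlations cancel and the complete watermarks appear orthogonal, yet with different data the total correlation is large; only the per-segment test reveals the problem.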

Turning to the method of detection, this may comprise:-

A) Perform local averaging over each 4×4 (or whatever was used) block of pixels to extract DC prediction values (as described above; see particularly equation 8 above and related description). In a preferred implementation, we have found that performing 2×2 averaging, determining DC prediction values for each 2×2 block, and then forming a single average DC prediction value for each 4×4 block may yield better results. This feature may be applied independently, in particular to the basic watermarking technique mentioned above.

B) For each data bit position within each 8×8 block (or whatever shape or size was used), determine the corresponding n×m block (or whatever shape was used) or sequence of key bits to correlate with the predetermined key (or one of a number of predetermined keys; see below). A strong positive or negative correlation gives a positive or negative value for the data bit accordingly (or vice versa), and failure to correlate above a threshold indicates that the picture may be corrupted, or the data unreliable.

As an optional check, the key values can be re-correlated, using the determined data values, to give a further measure of reliability. If the starting position in the picture was not certain, the process may be repeated for different positions or offsets, to determine the position which gives the maximum correlation.

The key used most preferably varies for each data bit. Looked at another way, each key (say 405 bits) can be viewed as a key segment of a larger key (64×405 bits). Using a different key for each bit ensures that the data bits can be distinguished and are not decoded in the wrong sequence due to shifts in the picture, and also avoids repeating patterns in the watermark, thereby reducing detectability. On the other hand, in the event that the data is shifted, the data will then become undetectable without searching for a registration point. In a preferred implementation, an additional registration watermark is provided, as discussed below, to facilitate registration: once the position of the known, higher dimension watermark is determined accurately, the data can be reliably detected. As with the embedding process, the detection process can be viewed in three ways, as follows:-

1) Block by Block

In the detector the picture frame is processed, creating a DC prediction error matrix of dimensions 22.5×8 by 18×8. This is broken down into 405 individual 8×8 blocks, each of which is multiplied by the pseudo-random sequence with which it was originally convolved. Averaging the 405 blocks will produce a single 8×8 correlation matrix, with positive or negative values in each cell of the matrix corresponding to the sign of the original data.

2) Whole Picture

In the detector the picture frame is processed, creating a DC prediction error matrix of dimensions 180 by 144. This is multiplied by the same pseudo-random sequence as that with which it was originally convolved. Averaging every eighth value along every eighth line (405 points) for each position of an 8×8 array will produce an 8×8 correlation matrix, with positive or negative values in each cell of the matrix corresponding to the sign of the original data.

3) Data Bit by Data Bit

In the detector the picture frame is processed, creating a DC prediction error matrix of dimensions 180 by 144. This is broken down into 64 individual m×n arrays by the inverse of the process by which they were assembled into a single array, and each of them is multiplied by the pseudo-random sequence with which it was originally convolved. Averaging the 405 values in each of the 64 blocks will produce 64 correlation values, with positive or negative values corresponding to the sign of the original data.

It will be seen that all three embedding and detection methods are equivalent and have equivalent effects on a picture, but may differ in the way in which they are implemented in hardware or software (all aspects and features of the invention may be implemented in either or a combination of both), for example in the loop structure of a software implementation or the processing layout of a hardware implementation.
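As an informal check of the "whole picture" view of detection, the following sketch embeds a 64-bit payload into a 180×144 array, adds noise as a crude stand-in for the residual picture content in the DC prediction error matrix, and recovers the data by multiplying by the key and averaging per data bit position. It is a sketch under stated assumptions: the DC prediction step itself is not modelled, the key comes from Python's `random` module, the edge half blocks are handled by plain truncation of the tiling, and all names are hypothetical.

```python
import random

ROWS, COLS, TILE = 144, 180, 8


def make_key(length, seed):
    """Pseudo-random bipolar key (+1/-1); the seed stands in for the secret."""
    rng = random.Random(seed)
    return [rng.choice((-1, 1)) for _ in range(length)]


def embed(data_bits, key):
    """Tile the 8x8 data block over the array and multiply by the key."""
    return [
        [(1 if data_bits[(r % TILE) * TILE + c % TILE] else -1) * key[r * COLS + c]
         for c in range(COLS)]
        for r in range(ROWS)
    ]


def detect(matrix, key):
    """Multiply by the key, then average the values sharing each data bit position."""
    sums = [0.0] * (TILE * TILE)
    counts = [0] * (TILE * TILE)
    for r in range(ROWS):
        for c in range(COLS):
            position = (r % TILE) * TILE + c % TILE
            sums[position] += matrix[r][c] * key[r * COLS + c]
            counts[position] += 1
    return [total / n for total, n in zip(sums, counts)]


rng = random.Random(7)
data = [rng.random() < 0.5 for _ in range(64)]
key = make_key(ROWS * COLS, seed=1)
# Noise of larger amplitude than the mark, standing in for picture content
noisy = [[value + rng.uniform(-3, 3) for value in line] for line in embed(data, key)]
decoded = [correlation > 0 for correlation in detect(noisy, key)]
print("recovered all bits:", decoded == data)
```

Because each data bit is averaged over roughly 400 key values, the sign of each correlation survives noise considerably larger than the mark itself.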

To improve accuracy and reliability of detection, it is useful to include a further watermark, the content of which is fixed, or which has a larger fixed (key) content and smaller data content; for example, the watermark may be used as one of several (most preferably three) watermarks w_(k), in the manner described above. As stated, the watermark may be combined with one or more fixed watermarks or further data carrying watermarks. In the latter case, the different data carrying watermarks are preferably convolved with substantially orthogonal keys. In one example, a single fixed watermark, selected from a small group of fixed watermarks, is used. Thus, applying the principles described above for encoding data by including exactly three from sixteen watermarks, ideally subdivided into four subsets of four watermarks (or whatever other numbers are chosen), in this case the keys rather than complete watermarks are chosen as indicated above and the watermarks themselves are generated by convolving the keys with further data. Thus, choosing 3 from 4 subsets of 4 keys allows 8 bits to be encoded, applying the principles mentioned above. In addition, each key is convolved with an 8×8 block of 64 bits of data, giving a total of 3×64+8=200 bits. It will be appreciated that certain information will be carried more robustly than other information, and the principle of carrying information with differing degrees of robustness may be independently provided, using other methods for encoding the data. A preferred implementation has one registration watermark, preferably selected from a relatively small set of possible watermarks, the choice of watermark encoding a few bits of information, and two data-carrying watermarks, preferably each using a fixed key and encoding typically 64 bits of data each.
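The figure of 200 bits can be reproduced by a short calculation. The decomposition used below is an assumption made for illustration (one way of reading "3 from 4 subsets of 4 keys"): the choice of which three subsets are used, together with the choice of one key from each chosen subset.

```python
from math import comb, log2

subset_choices = comb(4, 3)   # which 3 of the 4 subsets are used: 4 ways
key_choices = 4 ** 3          # one key out of 4 in each chosen subset: 64 ways
selection_bits = int(log2(subset_choices * key_choices))
total_bits = 3 * 64 + selection_bits  # three 64-bit data blocks plus the key selection

print(selection_bits, total_bits)
```

The 8 bits conveyed by the choice of keys depend only on which keys correlate, whereas the 192 bits of block data depend on the sign of each individual segment correlation.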

References to convolving data with a key as used herein are not limited to the multiplying and XOR operations described but apply more generally to combination in which the resulting watermark is a function of both the data and the key, particularly any form of combining wherein the data can be extracted from the resulting watermark using the key.

The above described methods for carrying hidden data and watermarking can be applied to both still and moving pictures. In the latter case motion imposes an additional problem in terms of attacks and visibility of the watermark, and the spatial model of eqs. (3)-(5) may have to be enhanced to a spatio-temporal model, resulting in motion-compensating embedding and detection. A simpler method is to embed the same static watermark only in every n-th (for example every second or third) picture of a moving sequence and/or to alternate between different static watermarks and/or grid patterns.

It is preferable in the case of a moving picture to change the data carried by the watermark only at certain prespecified points, preferably when a shot change is detected or when an accumulated change in picture content exceeds a threshold. Preferably, the watermark position is moved at a data change, as discussed below. Preferably a key portion of the watermark is changed when the data is changed; for example a key sequence may be stepped through. The key sequence may be generated from a pseudo random sequence, preferably having a seed value, preferably using a Blum, Blum, Shub random number generator. One or more seed(s) for the random number generator may be communicated to a decoder, preferably embedded in a picture by a method disclosed herein, and one or more seed or feature of the algorithm may be stored in the decoder so that the decoder may follow the sequence but an unauthorised party not knowing the seed and algorithm cannot easily do so.
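A Blum, Blum, Shub generator repeatedly squares its state modulo the product of two primes that are each congruent to 3 modulo 4, and outputs the least significant bit of each state. The sketch below is illustrative only: the primes 11 and 23 are toy values chosen for readability and offer no security, and the function names are hypothetical.

```python
def blum_blum_shub(seed, count, p=11, q=23):
    """Return `count` bits from x -> x*x mod (p*q), taking the least significant bit.

    p and q must be primes congruent to 3 mod 4, and the seed should share no
    factor with p*q. The defaults are toy values for illustration only.
    """
    modulus = p * q
    state = seed % modulus
    bits = []
    for _ in range(count):
        state = (state * state) % modulus
        bits.append(state & 1)
    return bits


def key_sequence(seed, count):
    """Map the generated bits to bipolar key values (+1 for 1, -1 for 0)."""
    return [1 if bit else -1 for bit in blum_blum_shub(seed, count)]


print(blum_blum_shub(3, 6))  # states 9, 81, 236, 36, 31, 202
print(key_sequence(3, 6))
```

An embedder and a decoder holding the same seed and primes step through the same sequence, which is the property relied upon above.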

To detect such a watermark, it is necessary to search for the position of the watermark at each shot change. In a detection method which may be provided independently, the invention provides a method of detecting a watermark in a sequence of moving pictures comprising determining an expected position of the watermark and thereafter detecting the watermark based on the expected position, wherein the expected position is re-determined following a shot change or a change in picture content above a threshold.

To reduce the visibility of a watermark in a sequence of moving pictures, it may be desirable to invert the sign of the watermark between pictures each time it is embedded, or according to a predetermined or pseudo-random sequence; in this way the mark will tend to average to zero (not exactly, due to picture content modulation) and so will be less visible. This feature may be provided independently or in combination with other features.
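The averaging effect of sign inversion can be illustrated with a short sketch. It assumes, for simplicity, that the same mark is embedded at the same amplitude in every picture, so that the modulation by picture content mentioned above is ignored; the names are illustrative.

```python
def temporal_average(mark, signs):
    """Average of the marks embedded over several pictures, ignoring picture content."""
    return [sum(sign * value for sign in signs) / len(signs) for value in mark]


mark = [1, -1, -1, 1]
static = temporal_average(mark, [1, 1, 1, 1])
alternating = temporal_average(mark, [1, -1, 1, -1])
print(static, alternating)
```

A detector that knows the sign sequence can re-invert each picture before averaging, whereas a viewer, or a party simply averaging pictures, sees the marks cancel.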

The position of the watermark may be moved, particularly if the watermark is a simple pattern such as a chequerboard or the like. The position in the picture can be determined by correlation with a fixed watermark. The position, or more preferably (this is more rugged) relative movement, may itself be used to encode information, for example with up, down, left, right being assigned to 2 bit code pairs, or 3 bits if diagonal movement is encoded. The distance moved may be used to encode further information, although this may be less rugged, particularly if the picture is processed by an effects processor. Each of these features relating to movement may be provided independently or in combination with other features.
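A minimal sketch of encoding 2 bits per relative movement follows. The particular assignment of codes to directions, the step size of 4 pixels and the function names are assumptions made for illustration; only the direction of each move is decoded, so the distance moved carries no information here.

```python
# Hypothetical assignment of 2-bit codes to unit moves (dx, dy); y grows downwards
MOVE_CODES = {(0, -1): 0b00, (0, 1): 0b01, (-1, 0): 0b10, (1, 0): 0b11}
CODE_MOVES = {code: move for move, code in MOVE_CODES.items()}


def sign(value):
    """Return -1, 0 or +1 according to the sign of value."""
    return (value > 0) - (value < 0)


def encode_positions(start, codes, step=4):
    """Successive watermark positions conveying the given 2-bit codes."""
    positions = [start]
    for code in codes:
        dx, dy = CODE_MOVES[code]
        x, y = positions[-1]
        positions.append((x + dx * step, y + dy * step))
    return positions


def decode_positions(positions):
    """Recover the 2-bit codes from the direction of each relative movement."""
    codes = []
    for (x0, y0), (x1, y1) in zip(positions, positions[1:]):
        codes.append(MOVE_CODES[(sign(x1 - x0), sign(y1 - y0))])
    return codes


codes = [0b11, 0b00, 0b10, 0b01]  # right, up, left, down
positions = encode_positions((16, 16), codes)
print(positions, decode_positions(positions))
```

Because only relative movement is decoded, a uniform displacement of the whole sequence of positions does not affect the recovered codes.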

In one preferred implementation, which may be provided independently, the watermark is moved, preferably randomly (preferably based on a random noise generator rather than a pseudo random sequence generator), substantially at each shot change, or whenever a measure of accumulated change in picture content exceeds a threshold. In the case of a data carrying watermark, preferably the data carried is changed at the same time as the watermark is moved, preferably every shot change. Such changes can assist in reducing detectability by unauthorised persons.

To hide the watermark better, it may be “attached” to a moving object and may move with the object. For example, using algorithms similar to MPEG-2 algorithms to assign motion vectors to blocks (indeed the same coding when the picture itself is to be MPEG coded), a watermark or portions of a watermark may be applied to blocks and then move with the blocks. A drawback, however, is that decoding such a picture will normally require either some information from the original picture or complex image processing algorithms (to identify the objects to which the watermark is attached). This may nonetheless be useful when the picture is being compared to an original version of the picture, for example to detect copying, as it will be hard for a pirate to identify and remove the watermark. This feature may be independently provided.

To explain the watermarking of a moving sequence in more detail, the process of averaging many pictures together to produce a single picture has the effect of reducing the variance of the picture luminance if the pictures are time-varying. If the watermark is static the result is that the amplitude of the watermark becomes larger relative to the variance of the picture luminance. If a sufficiently large number of watermarked pictures are averaged, the dominant feature will be the watermark. In practice this might be of the order of 6000 pictures, or 4 minutes at television rates. This property may be undesirable since it may assist in revealing the ‘secret key’ of the watermark. Another problem that may arise when pictures that contain motion are watermarked is that the static watermark may become more visible when the picture to which it is applied is moving. The subjective effect is that of seeing the picture through a glass screen containing imperfections. This may be undesirable for two reasons: because it represents a reduction in the quality of the signals that are produced, and because it gives a clue about the nature of the watermarking process that might be useful to malicious persons, in a similar manner to the problem of averaging mentioned above.

Above, motion-compensated embedding is proposed; a motion-compensation method will now be described in more detail in which the static watermark pattern is moved to follow the average motion of the contents of the picture. The features to be described, as well as the general feature of the watermark following motion in a picture, may be employed independently of other features and the features are independent of the precise implementation method used.

To implement this method, as a first step, the average motion of the picture is determined; this may be achieved by any of a number of known methods, such as phase correlation or block matching, each of which has known relative advantages and disadvantages, and the implementation is not limited to any particular method. The strength of the watermark is preferably reduced in areas where the actual motion does not match the average motion.

As will be appreciated, the process of determining whether the average motion is a good estimate of the actual motion in a particular part of the picture can be done in several ways and the embodiment is not limited to any particular method. One example of a more accurate method is to divide the picture into smaller blocks and perform a block-matching or correlation process on pairs of blocks. This kind of method is typically used in MPEG-2 video coders to calculate motion vectors. An example of another method, which is less accurate but computationally far simpler, is simply to compare luminance levels, pixel-by-pixel, between an actual picture and a picture predicted using a motion estimate. If there is a significant difference, then it is assumed that the motion estimate was not accurate.

Detection typically relies on averaging several pictures, and account must be taken of the average motion before this. Before the picture sequence is added together, each picture must preferably be shifted to undo the average motion relative to the first picture. Again, the precise method of determining average motion is not critical, but it is very highly desirable that the process by which the average motion estimate is made is the same as in the embedding process; otherwise errors may be introduced. A specific implementation of the above will now be described in yet further detail.

An Exemplary Embedding Process is as follows:

1) a watermark W_(k) is embedded in picture P_(k)

2) the average motion, v_(k), from P_(k) to picture P_(k+1) is measured. This may conveniently be achieved by calculating the cross-correlation function of picture P_(k) and P_(k+1); this has the benefit of being relatively simple to implement and readily applicable in the step of detection of the watermark. Another benefit is that the apparatus may already include hardware accelerators (or optimised code) for calculating cross correlation of at least a portion of a picture, for other reasons.

3) a spatially shifted watermark, W_(k+1), is calculated by applying a cyclic shift to W_(k) according to the motion estimate v_(k)

4) the error, E_(k+1), in the estimated picture is calculated. One way this may be achieved is by applying the same shift to P_(k) to give P⁻_(k+1), and calculating the difference between pixel luminance values in the estimate P⁻_(k+1) and the actual P_(k+1). Another, more advanced, method for calculating the error E_(k+1) is to compare for each pixel the accuracy of the global shift with the estimate of a ‘true’ motion estimator.

5) the watermark W_(k+1) is then modulated according to the error signal E_(k+1) to generate W′_(k+1)=W_(k+1)·f(E_(k+1)).

The function f is preferably chosen to assist in concealing the watermark in areas where the motion estimate is not accurate and is preferably, but not necessarily, a non-linear function of the error signal. However, the function should also be readily implementable, and one simple example is the use of a linear function (C−|E_(k+1)|) where C is a constant equal to the possible range of the error E. The function f may also take into account the occurrence of a shot change or of a still picture displayed over several frame periods; various functions may be chosen depending on the degree to which concealment is required and the computational power available to implement the function.

6) the modulated, shifted watermark W′_(k+1) is embedded into the picture P_(k+1) substantially as described in detail above for static pictures. This process is then repeated starting at step 2).
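Steps 3) to 5) of the embedding process may be sketched as follows. This is a simplified sketch under stated assumptions: pictures and watermarks are small two-dimensional lists of luminance and bipolar values, the motion estimate is supplied as a (rows, columns) displacement, and the linear function (C−|E|) is divided by C so that the modulation factor lies between 0 and 1, a normalisation which is not specified in the description. All names are hypothetical.

```python
def cyclic_shift(image, dy, dx):
    """Shift a 2-D list cyclically by dy rows and dx columns."""
    height, width = len(image), len(image[0])
    return [[image[(y - dy) % height][(x - dx) % width] for x in range(width)]
            for y in range(height)]


def error_map(p_k, p_next, motion):
    """Step 4: per-pixel luminance difference between predicted and actual picture."""
    predicted = cyclic_shift(p_k, *motion)
    return [[abs(a - b) for a, b in zip(row_pred, row_next)]
            for row_pred, row_next in zip(predicted, p_next)]


def next_watermark(w_k, p_k, p_next, motion, c=255):
    """Steps 3 and 5: shift the watermark, then scale it by f(E) = (C - |E|) / C."""
    shifted = cyclic_shift(w_k, *motion)
    error = error_map(p_k, p_next, motion)
    return [[w * (c - e) / c for w, e in zip(row_w, row_e)]
            for row_w, row_e in zip(shifted, error)]


p_k = [[10, 20, 30, 0], [40, 50, 60, 70]]
p_next = cyclic_shift(p_k, 0, 1)   # content moves one pixel to the right
p_next[0][0] = 255                 # one pixel does not follow the average motion
w_k = [[1, -1, 1, -1], [-1, 1, -1, 1]]
w_next = next_watermark(w_k, p_k, p_next, (0, 1))
print(w_next)
```

Where the prediction is exact the shifted watermark is embedded at full strength; at the mismatched pixel the factor falls to zero, so the mark is suppressed there.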

A Corresponding Detection Process is as follows:

1) the average motion, v_(k), from P_(k) to picture P_(k+1) is measured. As above, this may conveniently be achieved by calculating the cross-correlation function of picture P_(k) with P_(k+1), although other methods may be used in both cases.

2) a spatially shifted version of P_(k+1), P′_(k+1), is calculated by applying a cyclic shift opposite to v_(k)

3) the spatially shifted picture P′_(k+1) is added to P_(k)

4) the average motion, v_(k+1), from picture P_(k+1) to picture P_(k+2) is measured by calculating the correlation function of picture P_(k+1) with P_(k+2)

5) a spatially shifted picture P′_(k+2) is calculated by applying a cyclic shift opposite to (v_(k)+v_(k+1)) to P_(k+2)

6) the spatially shifted picture P′_(k+2) is added to the sum of P_(k) and P′_(k+1)

Steps 4), 5), and 6) are repeated over the number of pictures, N, required for an adequate level of detection to produce a motion compensated average picture P″:

P″ = (P_(k) + P′_(k+1) + P′_(k+2) + . . . + P′_(k+N−1))/N

7) the cross-correlation function of the average picture P″ with the watermark W_(k) is calculated as for a static watermark (it will be noted that this may employ or share some hardware or software with the cross-correlation used to determine motion)
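The detection steps above may be sketched as follows, again under simplifying assumptions: pictures are small two-dimensional lists, the average motion is a pure cyclic displacement found by an exhaustive cyclic cross-correlation (a brute-force stand-in for whatever estimator is actually used), and the pre-processing applied before the final correlation for a static watermark is omitted. All names are hypothetical.

```python
def cyclic_shift(image, dy, dx):
    """Shift a 2-D list cyclically by dy rows and dx columns."""
    height, width = len(image), len(image[0])
    return [[image[(y - dy) % height][(x - dx) % width] for x in range(width)]
            for y in range(height)]


def correlate(a, b):
    """Sum of products of co-located values of two equally sized 2-D lists."""
    return sum(x * y for row_a, row_b in zip(a, b) for x, y in zip(row_a, row_b))


def estimate_shift(previous, current):
    """Displacement (dy, dx) maximising the cyclic cross-correlation."""
    height, width = len(previous), len(previous[0])
    candidates = [(dy, dx) for dy in range(height) for dx in range(width)]
    return max(candidates, key=lambda s: correlate(cyclic_shift(previous, *s), current))


def motion_compensated_average(pictures):
    """Undo the accumulated motion of each picture, then average (steps 1 to 6)."""
    total = [row[:] for row in pictures[0]]
    acc_dy = acc_dx = 0
    for previous, current in zip(pictures, pictures[1:]):
        dy, dx = estimate_shift(previous, current)
        acc_dy, acc_dx = acc_dy + dy, acc_dx + dx
        undone = cyclic_shift(current, -acc_dy, -acc_dx)
        total = [[t + u for t, u in zip(row_t, row_u)]
                 for row_t, row_u in zip(total, undone)]
    return [[value / len(pictures) for value in row] for row in total]


base = [[y * 5 + x for x in range(5)] for y in range(4)]
pictures = [base, cyclic_shift(base, 0, 1), cyclic_shift(base, 1, 2)]
average = motion_compensated_average(pictures)
print(average == base)
```

In this idealised case the compensated average reproduces the first picture exactly, so any mark that followed the same motion would be reinforced rather than blurred; step 7) would then apply `correlate` (after the usual pre-processing) between the average and the watermark W_(k).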

The use of motion-compensated watermarking thus makes embedded watermarks difficult to detect by unauthorised individuals and reduces picture degradation.

Preferred Applications

The uses of the watermark fall broadly into two categories: one is identification, the other is authentication. Identification associates the signal with descriptive information. In general, the association is made by using the watermark to convey a unique identification number that points to a record in a database holding more information. Authentication establishes the credibility of a signal. To identify a signal the watermark should be difficult to remove, should be robust against distortions and should be difficult to perceive. To authenticate a signal the watermark should be difficult to create (or imitate), should be sensitive to distortions but can be easy to perceive. The system described above is particularly useful for identification but, as discussed, may be adapted for either application. There are many scenarios in which the identity of the signal is of significance. Some of these are as follows:

Copyright Protection:

Somebody who has created something of artistic worth might wish to be rewarded for the use of their creation. A broadcaster who creates a television programme usually retains rights over the programme. Limited rights to the programme might be sold to another broadcaster permitting some use of the programme, perhaps a defined number of broadcasts in a defined geographical area. The broadcaster owning the rights would like to know that the rights granted are not exceeded. If a watermark is added to the programme before it is sold, monitoring of broadcasts around the world can generate reports of detection of the broadcast of the programme by virtue of detecting the watermark. The record of the detections can be analysed to find out whether the rights granted have been exceeded. Copyright infringements that might be found in this way would include the transmission of a programme more times than were agreed, transmission of the programme in more geographical regions than were agreed, or the theft of the programme (either by recording or retransmission of a legitimate broadcast) and use by someone unauthorised. To identify the route by which a programme came to be used illegitimately, a watermark can be made specific to an individual copy of the programme. Such a watermark is often called a fingerprint. This can be used to identify the recipient of the programme of which the rights were infringed as well as the originator.

Copy Protection:

As a step further on from detection of copyright infringements, a watermark can also be used to control the duplication of material. If a recording system is used that will not record signals in which it detects a particular watermark, then this provides a means of preventing illegitimate copies of the recording being made by that system. A related application would use an authenticating watermark to prevent replay of signals unless the mark were detectable in the signals.

Production Meta-Data Tracking:

The processes that are involved in producing television programmes are often complex and numerous. To keep track of all the programme material during this procedure can be difficult. The descriptive information, called meta-data, associated with the picture or sound signal (such as what it is, where it came from, where it has been used, where it will be used) can easily become dissociated from the signal itself. A simple example would occur when a self-adhesive label carrying a unique identifying programme number falls off the side of a video-cassette. An identifying watermark embedded in the pictures can be used to relate the pictures back to the descriptive information. Data embedded may comprise a SMPTE Unique Material Identifier (UMID) or a subset of the data defined therein or a proprietary programme identifier such as a British Broadcasting Corporation Audio-visual number. Considerations pursuant to the invention suggest that, for most purposes, a 32 bit number may be adequate. However, 48 bits is preferable and a 64 bit identifier enables maximum flexibility of encoding for a variety of producers, users and purposes. Since it has not hitherto been easy to embed such amounts of data in a picture, prior art considerations have concentrated on packing data into a small identifier, rather than increasing the data payload.

The stage of the production process at which such a watermark is embedded depends on the information tracking requirements. Conceivably, the watermark could be embedded by a camera and could consist of a unique identifier which conveys the time of day and geographical location as well as a serial number corresponding to the camera. This would allow the plethora of constituent video clips that are combined to make a typical television programme to be identified and traced back to their source at any stage of the process, even, given sufficient robustness, after broadcast and re-recording.

At each stage of the production process, including final broadcast, every piece of equipment, on detecting the watermark, could update a database that records the use of all programme material. This can be used to check that contractual agreements relating to rights are not being breached, or to ensure that when material is used anyone entitled to payment for its use is paid.

Each feature disclosed herein may be independently provided, unless otherwise stated. The appended abstract is incorporated herein by reference.

1. A method of embedding a watermark signal comprising a series of watermark values in a picture signal comprising a series of picture sample values, the method comprising adjusting picture sample values based on watermark values, characterised in that adjusting comprises, for each watermark value: combining the watermark value with a respective subset of the picture sample values using a plurality of adjustment factors, each adjustment factor being based on a local estimate of the visibility of the watermark at a corresponding picture sample location.
2. A method according to claim 1 wherein the magnitude of each adjustment factor is a function of the picture sample values, preferably based on the localised variance of the picture sample values.
3. A method according to claim 1, wherein the sign of each adjustment factor is a function of the watermark values.
4. A method according to claim 1, wherein combining comprises adding an adjustment factor to each picture sample value.
5. A method according to claim 1, wherein the picture sample locations for each said subset corresponding to a given watermark value are substantially adjacent.
6. A method according to claim 5, wherein a grid dividing the picture into a plurality of regions is defined and wherein each said subset comprises picture samples corresponding to a respective region of the grid.
7. A method according to claim 6, wherein the grid is substantially rectangular.
8. A method according to claim 7, wherein each region corresponds to a block of a coding process, for example MPEG or JPEG compression.
9. A method according to claim 1, wherein watermark values are assigned to substantially all of said regions of the picture.
10. A method according to claim 1, wherein watermark values are assigned to a first group of said regions and sample values of a data sequence are assigned to a second group of said regions.
11. A method according to claim 1, wherein the picture sample locations for each subset are scattered throughout the picture.
12. A method according to claim 1, wherein the watermark signal comprises a pseudo-random sequence.
13. A method according to claim 1, wherein the watermark signal comprises a regular pattern.
14. A method according to claim 1, wherein the watermark signal has a substantially zero mean whereby the global mean of the picture sample values in the picture is substantially unaffected by embedding of the watermark.
15. A method according to claim 1, wherein combining is arranged to change the local mean of the picture sample values included in each said subset, the sign of the change being determined by the corresponding watermark value.
16. A method according to claim 1, wherein each subset contains at least 16 picture sample values.
17. A method according to claim 1 wherein the adjustment factors are a function of a global modulation index variable whereby the energy of the watermark can be controlled.
18. A method according to claim 1 wherein said adjustment factors are assigned a value substantially equal to zero for regions having a measure of variance below a predetermined threshold.
19. A method according to claim 1, for embedding a watermark in a sequence of pictures corresponding to a motion video sequence, wherein the subsets to which watermark values are applied vary from picture to picture.
20. A method of embedding a watermark within a sequence of pictures corresponding to a motion video sequence wherein the watermark is combined with picture sample values characterised in that the method of combining varies from picture to picture to reduce the appearance of static artifacts in the sequence.
21. A method according to claim 20, for embedding a watermark in a sequence of pictures corresponding to a motion video sequence, wherein applying the watermark to the sequence of pictures includes compensating for motion between pictures.
22. A method according to claim 21, wherein motion compensation includes estimating average motion between pictures.
23. A method according to claim 22, wherein applying the watermark includes determining at least a local measure of accuracy of the estimate of average motion.
24. A method according to claim 23, wherein the watermark is applied so as to be reduced in visibility in areas of the picture (in the case of a local measure of accuracy) or in pictures (in the case of a global measure of accuracy) where said measure indicates that said estimate is inaccurate.
 25. A method of embedding a watermarkwithin a sequence of pictures corresponding to a motion video sequence,wherein the watermark is combined with picture sample valuescharacterised in that the method of combining includes generating amotion-compensated version of the watermark to reduce the appearance ofstatic artifacts in the sequence.
 26. A method of embedding a watermark within a sequence of pictures corresponding to a motion video sequence, wherein the watermark is combined with picture sample values characterised in that the method of combining includes estimating average motion between the pictures and combining the watermark to reduce the appearance of static artifacts in the sequence.
 27. A method according to claim 26, wherein a measure of the accuracy of said estimate is determined and the strength of the watermark is varied as a function of said measure.
 28. A method according to claim 20, wherein the watermark is embedded in some but not all pictures of the sequence, preferably wherein at most one in two pictures are watermarked.
 29. A method according to claim 20, wherein the pattern of picture sample values with which watermark values are combined varies from picture to picture.
 30. A method according to claim 29, wherein embedding the watermark includes defining a grid dividing the picture into regions and wherein at least one characteristic of the grid, for example shape, size or alignment, is varied between pictures of the sequence.
 31. A method according to claim 1 further comprising communicating or storing the watermarked picture together with information to assist in identifying the watermark.
 32. A method according to claim 31 wherein said information comprises a series of local mean values, each mean value corresponding to the local mean of a subset of picture sample values prior to application of a watermark value.
 33. A method according to claim 32, wherein the information is compressed, preferably JPEG compressed.
 34. A method of testing for the presence of a watermark embedded in a picture signal by a method according to claim 1, comprising: receiving the picture signal; enhancing the watermark content of the received picture signal using local mean picture values; correlating the picture signal or a processed signal derived therefrom with a watermark signal; and outputting an estimate of the presence of the watermark based on the results of said correlation.
 35. (canceled)
 36. A method according to claim 34 wherein enhancing includes filtering based on received reference picture information.
 37. A method according to claim 36 wherein the received reference picture information comprises reference local mean values indicative of local mean values of subsets of picture samples in the picture prior to watermarking, or compressed information from which said reference local mean values can be derived.
 38. A method according to claim 37 wherein processing includes estimating local mean values indicative of local mean values of subsets of picture samples in the picture prior to watermarking.
 39. A method according to claim 37, wherein processing includes subtracting the reference or estimated local mean values from local mean values determined for the received picture signal to give a difference signal.
 40. A method according to claim 37, wherein a grid is defined dividing the received signal into regions corresponding to allocation of watermark values, wherein local mean values are determined for each of said regions.
 41. A method according to claim 34, further comprising deriving a series of data values from the received picture signal.
 42. A method according to claim 39, wherein the data sample values are determined based on the sign of the difference signal in regions corresponding to allocation of data values.
 43. A method according to claim 34, wherein said correlating is performed for a plurality of offsets and the offset giving the maximum correlation is determined to give a measure of the position of the watermark within the picture.
 44. A method according to claim 34, wherein correlating is performed taking into account possible effects of picture processing operations, for example rotation, scaling, shifting, cropping or re-sampling operations.
 45-51. (canceled)
 52. A method according to claim 34 for detecting a watermark embedded by a method wherein the watermark is derived from a combination of a number of substantially independent watermarks or wherein a number of substantially independent watermarks are embedded in each picture, wherein an estimate of the presence of each of said number of substantially independent watermarks is produced.
 53. A method according to claim 52, wherein said number of substantially independent watermarks comprises a subset selected from a defined set of substantially independent watermarks, wherein the picture is cross correlated with a composite watermark derived from a sum of each of the watermarks of said defined set.
 54. A method according to claim 53 wherein, at least in the event of cross correlation with said composite watermark yielding a positive result, the picture is cross correlated with each of the watermarks of said defined set.
 55. A method according to claim 54, wherein the watermark is derived from a combination of three watermarks selected from a defined set of substantially independent watermarks, wherein the three watermarks giving the greatest cross-correlation values are identified.
 56. A method according to claim 34, including estimating the cumulative average motion in a sequence of pictures.
 57. A method according to claim 34, further comprising computing a motion-compensated average picture taking into account the average motion in the pictures.
 58. A method according to claim 57, wherein the watermark is cross-correlated with the motion-compensated average picture.
 59. A method of detecting a motion-compensated watermark comprising: estimating the cumulative average motion in a sequence of pictures; computing an average picture taking into account the average motion in the pictures; and calculating the cross-correlation function of the motion-compensated average picture and the watermark.
 60. A watermarked picture, a sequence of pictures, a signal or data storage means containing a picture having a watermark embedded therein by a method according to claim 1.
 61. Apparatus or a computer program product arranged to perform a method according to claim 1.
 62. A method of decoding data in a picture signal comprising: determining local mean values of picture samples corresponding to regions of the picture in which data is carried; comparing said local mean values to estimated or reference local mean values for said regions in the absence of the data; and determining a data value from the result of each comparison, wherein preferably the data value is determined from at least the sign of the difference between the determined local mean value and the estimated or reference local mean value.
 63. A method of embedding data comprising a series of data values in a picture comprising a series of picture values, comprising defining a plurality of subsets of the picture values, one subset for each data value, and adding an adjustment factor to each picture value in each subset, a first component, preferably the magnitude, of each adjustment factor being a function of an estimate of the visibility of embedded data at the picture value location and being variable between the picture values of each subset, a second component, preferably the sign, of the adjustment factor being determined by the data value and being substantially constant for the picture values of each subset.
 64. A method of embedding a watermark signal comprising a series of watermark values in a picture signal comprising a series of picture sample values, the method comprising adjusting picture sample values based on watermark values including generating the watermark by convolving a key with a repeated data sequence to produce a data-carrying watermark.
 65. A method of generating a watermark encoding data to be applied to a picture comprising convolving a key comprising a plurality of bits with a plurality of bits of data.
 66. A method according to claim 65, wherein each data bit is convolved with substantially an entire key segment having a predetermined length.
 67. A method according to claim 66, wherein a different key segment is convolved with each data bit.
 68. A method according to claim 66, wherein each bit of the watermark is applied to a plurality of bits of the picture, preferably a block of at least 4 bits of the picture.
 69. A method according to claim 64, wherein a registration watermark is applied to the picture in addition to the data-carrying watermark, to facilitate decoding of data.
 70. A method of embedding a watermark in a moving picture comprising changing the watermark or moving the watermark, preferably substantially randomly, at a shot change, or following detection of an accumulated change in picture content above a predetermined threshold.
 71. A method of embedding a data-carrying watermark in a moving picture comprising changing the data carried by the watermark at a shot change, or following detection of an accumulated change in picture content above a predetermined threshold.
 72. A method of embedding a data-carrying watermark in a moving picture comprising moving the watermark when the data content of the watermark changes, preferably at a shot change or following detection of an accumulated change in picture content above a predetermined threshold.
 73. A method of labeling a frame of a moving picture signal comprising embedding an identifier of at least 64 bits into the picture, by a method according to claim 1.
 74. A method according to claim 73, for use in labeling a broadcast signal or signal to be distributed wherein the identifier encodes at least one of: the originator of the material, an authorised recipient and a material identifier.
 75. A method according to claim 73, for use in labeling a source material in a studio, preferably implemented in a camera or recording device, wherein the identifier includes at least one of: an identifier of the source of the material, preferably including at least one of an identifier of a camera, recording time, location, conditions, and a user-definable label.
 76. A method of decoding a data-carrying watermark embedded in a picture by a method according to claim 64, comprising: setting a positional reference, based on the position of a registration watermark; estimating watermark values at a plurality of picture locations, preferably by determining local average values; and, based on the estimated watermark values at locations corresponding to each data bit and a key value corresponding to the location, determining a value for each data bit.
 77. A method according to claim 76, wherein watermark values are determined for a plurality of locations for each data bit and wherein the value for each data bit is determined by averaging the product of an estimated watermark value and a key value for each of said plurality of locations.
 78. A method of detecting a watermark in a sequence of moving pictures comprising determining an expected position of the watermark and thereafter detecting the watermark based on the expected position, wherein the expected position is re-determined following a shot change or a change in picture content above a threshold.
 79. (canceled)
 80. A method of embedding data in a picture comprising: generating a data-carrying watermark having a plurality of watermark values by convolving a set of data comprising a plurality of bits of data with a key comprising a plurality of bits; and applying the watermark to the picture by combining each watermark value with a plurality of picture values based on a local estimate of the visibility of the watermark.
 81. A method according to claim 80 for embedding a data stream in a sequence of pictures wherein sets of data are generated at intervals from the data stream and each set is embedded in a plurality of pictures.
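
By way of illustration only, and without limiting the claims, the embedding of claim 63 and the decoding of claim 62 might be sketched as follows. The sketch assumes square blocks as the subsets, a picture held as a list of rows of sample values, and reference local mean values available to the decoder. The function names, the parameters `base` and `gain`, and the use of the deviation from the block mean as a visibility estimate are all hypothetical choices made for this example and are not taken from the specification.

```python
def block_means(picture, block):
    """Return the mean of each block x block region, in row-major block order."""
    rows, cols = len(picture), len(picture[0])
    means = []
    for r0 in range(0, rows, block):
        for c0 in range(0, cols, block):
            vals = [picture[r][c]
                    for r in range(r0, min(r0 + block, rows))
                    for c in range(c0, min(c0 + block, cols))]
            means.append(sum(vals) / len(vals))
    return means


def embed_bits(picture, bits, block, base=1.0, gain=0.1):
    """Add one adjustment factor per sample; one data bit per block.

    Sign: fixed within a block, set by the data bit (1 -> +, 0 -> -).
    Magnitude: varies per sample. ASSUMPTION: visibility is taken to fall
    as the sample departs from its block mean, so the magnitude grows
    with that deviation. No clipping to a legal sample range is done.
    """
    rows, cols = len(picture), len(picture[0])
    means = block_means(picture, block)
    blocks_per_row = (cols + block - 1) // block
    marked = [row[:] for row in picture]
    for r in range(rows):
        for c in range(cols):
            idx = (r // block) * blocks_per_row + (c // block)
            if idx >= len(bits):
                continue  # no data allocated to this block
            sign = 1.0 if bits[idx] else -1.0
            magnitude = base + gain * abs(picture[r][c] - means[idx])
            marked[r][c] = picture[r][c] + sign * magnitude
    return marked


def decode_bits(received, reference_means, block, n_bits):
    """Recover each bit from the sign of (received mean - reference mean)."""
    received_means = block_means(received, block)
    return [1 if received_means[i] - reference_means[i] > 0 else 0
            for i in range(n_bits)]


if __name__ == "__main__":
    picture = [[10.0, 12.0, 50.0, 50.0],
               [11.0, 13.0, 52.0, 48.0],
               [90.0, 90.0, 30.0, 35.0],
               [90.0, 90.0, 25.0, 30.0]]
    bits = [1, 0, 0, 1]
    reference = block_means(picture, 2)      # stored before watermarking
    marked = embed_bits(picture, bits, 2)
    print(decode_bits(marked, reference, 2, len(bits)) == bits)
```

Because every adjustment within a block shares one sign and has a magnitude of at least `base`, the block mean moves in the direction set by the data bit, which is what allows the sign of the mean difference to carry the data in this simplified, distortion-free setting.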